IN THE UNITED STATES PATENT AND TRADEMARK OFFICE
BEFORE THE BOARD OF PATENT APPEALS AND INTERFERENCES



On Appeal to the Board of
Appeals and Interferences



Applicant  : Paek et al.

Serial No. : 09/830,899                Group Art Unit: 2171

Filed      : August 13, 2001           Examiner: Leroux, Etienne Pierre

Title      : DESCRIPTION SCHEMES FOR MPEG-7 IMAGE/VIDEO
             CONTENTS DESCRIPTION

BRIEF ON APPEAL 

This brief on appeal is filed in response to an Office Action issued by the 
U.S. Patent and Trademark Office (the "PTO") on July 3, 2006. On November 3, 2006, 
Appellants filed a Notice of Appeal in the above-identified patent application from the rejection 
of claims 1-43. In accordance with 37 C.F.R. § 41.37, this Appeal Brief is submitted in support 
of the Appeal of the rejections of record. The fee for this Appeal, as set forth in 37 C.F.R.
§ 41.20(b)(2), is provided herewith.

For the reasons set forth below, the rejections of pending claims 1-43 should be
reversed.

I. REAL PARTY IN INTEREST

The real party in interest is The Trustees of Columbia University in the City of 
New York, by way of assignment from the named inventors, recorded on August 13, 2001 at 
Reel 012068, Frame 0221. 



II. RELATED APPEALS AND INTERFERENCES

Appellants and the Appellants' legal representatives are unaware of any pending 
appeals or interferences related to the present application which will directly affect or be directly 
affected by, or have a bearing on, the Board's decision in the pending appeal. 
III. STATUS OF CLAIMS

In the July 3, 2006 Office Action, claims 1-43 were rejected under 35 U.S.C. 
§ 102(b) as allegedly anticipated by U.S. Patent No. 6,079,566 to Eleftheriadis et al. (hereinafter
"Eleftheriadis"). Appellants respectfully traverse the rejections of record.

A copy of all of the pending claims is attached hereto in the Claims Appendix at
page A-1.

IV. STATUS OF AMENDMENTS 

Subsequent to the issuance of the Final Official Action dated July 3, 2006, 
no further amendments to the claims have been filed by Appellants. 

V. SUMMARY OF CLAIMED SUBJECT MATTER

The claimed subject matter described in the above-identified application is
directed to a method and system for generating a description record from multimedia
information (e.g., Specification, page 4, lines 24-27). Specifically, the claimed subject matter
of the present application has useful applications in, e.g., cataloging, indexing and searching
multimedia content, as is described in more detail below. (Specification, page 4, line 24 - p. 5,
line 12).

Provided herein are some non-limiting references to the specification for 
illustrative purposes only. Independent claim 1 is directed to a system for generating a 
description record from multimedia information, comprising: 
at least one multimedia information input interface receiving said multimedia
information; [e.g., specification, p. 26 ("Digital image data 710 is applied to the
computer system via link 711."); pp. 27-28, Fig. 8]

a computer processor, coupled to said at least one multimedia information
input interface, receiving said multimedia information therefrom [e.g.,
specification, p. 26 ("Digital image data 710 is applied to the computer
system via link 711."); pp. 27-28, Fig. 8], processing said multimedia
information by performing object extraction processing [e.g.,
specification, p. 26; Fig. 7, "object extraction 720"; Fig. 3] to generate
multimedia object descriptions [e.g., specification, p. 26, "object set 721,"
"object descriptions"; Fig. 5] from said multimedia information, and
processing said generated multimedia object descriptions by object
hierarchy processing [e.g., specification, p. 27; Fig. 7, "object hierarchy
extraction and construction module 730"; p. 28; Fig. 8, module 830] to
generate multimedia object hierarchy descriptions [e.g., pp. 18-19, 23, 25;
Figs. 3, 4a, 4b, 5, 6a, 6b] indicative of an organization of said object
descriptions [e.g., pp. 18-19, 23, 25; Figs. 3, 4a, 4b, 5, 6a, 6b], wherein at
least one description record including said multimedia object descriptions
and said multimedia object hierarchy descriptions is generated for content
embedded within said multimedia information [throughout the
specification, it is described in numerous instances that the descriptions
are generated for multimedia content, e.g., p. 5, line 12; Fig. 3]; and

a data storage system, operatively coupled to said processor, for storing said at
least one description record [e.g., Fig. 7, 740; Fig. 8, 840, and related
descriptions in specification].

(Claim 1). 

Importantly, the claimed subject matter includes the recitation of "performing 
object extraction processing to generate multimedia object descriptions from said multimedia 
information, and processing said generated multimedia object descriptions by object hierarchy 
processing to generate multimedia object hierarchy descriptions indicative of an organization of 
said object descriptions, wherein at least one description record including said multimedia object 
descriptions and said multimedia object hierarchy descriptions is generated for content 
embedded within said multimedia information." (Claim 1) (See, e.g., Specification pp. 11-14,
26-29; Figs. 2, 3, 7, 8; and related citations to the specification and drawings as indicated above).
Similar recitations are recited in independent method claim 17, including, e.g.: 
receiving said multimedia information [e.g., specification, p. 26 ("Digital image
data 710 is applied to the computer system via link 711."); pp. 27-28, Fig. 8];

processing said multimedia information by performing object extraction
processing [e.g., specification, p. 26; Fig. 7, "object extraction 720";
Fig. 3] to generate multimedia object descriptions from said multimedia
information [e.g., specification, p. 26, "object set 721," "object
descriptions"];

processing said generated multimedia object descriptions by object
hierarchy processing [e.g., specification, p. 26; Fig. 7, "object extraction
720"; Fig. 3] to generate multimedia object hierarchy descriptions [e.g.,
specification, p. 26, "object set 721," "object descriptions"; Fig. 5]
indicative of an organization of said object descriptions, wherein at least
one description record including said multimedia object descriptions and
said multimedia object hierarchy descriptions is generated [throughout the
specification, it is described in numerous instances that the descriptions
are generated for multimedia content, e.g., p. 5, line 12; Fig. 3];

storing said at least one description record [e.g., Fig. 7, 740; Fig. 8, 840,
and related descriptions in specification].

(Claim 17). 

and in independent computer-readable medium claim 33, which includes, inter alia: 

one or more multimedia object descriptions, generated by performing
object extraction processing [e.g., specification, p. 26; Fig. 7, "object
extraction 720"; Fig. 3], said object descriptions describing corresponding
multimedia objects [e.g., specification, p. 26, "object set 721," "object
descriptions"] [throughout the specification, it is described in numerous
instances that the descriptions are generated for multimedia content, e.g.,
p. 5, line 12; Fig. 3];

one or more features characterizing each of said multimedia object
descriptions;

one or more multimedia object hierarchy descriptions indicative of an
organization of said object descriptions [e.g., specification, p. 26, "object
set 721," "object descriptions"; Fig. 5], if any, relating at least a portion
of said one or more multimedia objects in accordance with one or more
characteristics [e.g., specification, p. 26, "object set 721," "object
descriptions"; Fig. 5].

(Claim 33). 
VI. GROUNDS OF REJECTION TO BE REVIEWED ON APPEAL

Claims 1-43 were rejected under 35 U.S.C. § 102(b) as allegedly anticipated by 
U.S. Patent No. 6,079,566 to Eleftheriadis et al. (hereinafter "Eleftheriadis"). Appellants 
respectfully request review of all rejections of record. 
VII. ARGUMENT 

Preliminarily, Appellants note for the record that Appellants do not acquiesce to
or otherwise agree with comments in the previous (and now moot) April 13, 2006 Office Action,
including, in particular, the comments regarding "Priority" on pages 2-3 and the "Response to
Arguments" on pages 8-11. Because that April 13, 2006 Office Action is now withdrawn in
favor of the more recently issued July 3, 2006 Office Action, Appellants consider the statements
regarding priority and the prior art (including the previously-cited Application Publication
No. 2001/0000962 of Rajan) to be moot and not of record. Accordingly, Appellants focus herein
only on the present rejections of record in the above-referenced application as set forth in the
July 3, 2006 Office Action.

A. The Rejections Under 35 U.S.C. § 102(b) in view of Eleftheriadis 
Should Be Reversed 

In the July 3, 2006 Office Action, claims 1-43 were rejected under 35 U.S.C. 
§ 102(b) as allegedly anticipated by U.S. Patent 6,079,566 to Eleftheriadis et al. (hereinafter 
"Eleftheriadis"). Appellants respectfully traverse the rejections of record. 

1. Relevant Case Law 

To establish an anticipation rejection, the cited reference must teach every 
element of the claimed subject matter. 35 U.S.C. § 102(b) states, in pertinent part, that "[a] 
person shall be entitled to a patent unless the invention was patented or described in a printed 
publication in this or a foreign country or in public use or on sale in this country, more than one 
year prior to the date of the application for patent in the United States." A patent claim is thus
anticipated under Section 102 if, among other things, "identity of invention" is shown.
Minnesota Mining and Manufacturing Co. v. Johnson & Johnson Orthopaedics, Inc., 976 F.2d
1559, 1565, 24 U.S.P.Q.2d 1321 (Fed. Cir. 1992). In finding identity of invention, one "must
show that each element of the claim in issue is found ... in a single prior art reference." Id. The
Federal Circuit has held that, "[a] claim is anticipated only if each and every element as set forth
in the claim is found, either expressly or inherently described, in a single prior art reference."
Verdegaal Bros. v. Union Oil Co. of California, 814 F.2d 628, 631, 2 U.S.P.Q.2d 1051 (Fed. Cir.
1987). Moreover, "[a] prior art publication cannot be modified by the knowledge of those
skilled in the art for purposes of anticipation." In re Saunders, 444 F.2d 599, 602-03, 170
U.S.P.Q. 213 (C.C.P.A. 1971).

2. Summary of Arguments 

Eleftheriadis does not disclose or suggest a technique for generating a description 
record from multimedia information including, among other things, "performing object 
extraction processing to generate multimedia object descriptions," or "processing said generated 
multimedia object descriptions by object hierarchy processing to generate multimedia object 
hierarchy descriptions," as recited in claims 1 and 17, or, as similarly recited in claim 33, 
"multimedia object descriptions, generated by performing object extraction processing," or 
"one or more multimedia object hierarchy descriptions indicative of an organization of said 
object descriptions, if any, relating at least a portion of said one or more multimedia objects in 
accordance with one or more characteristics." Additionally, Eleftheriadis does not disclose or
suggest the claimed "feature extraction processing" of claims 3, 7, 10, 15, 19, 23, 26 and 31. As
discussed more fully herein below, because Eleftheriadis fails to disclose or suggest at least these
claimed features, Appellants respectfully submit that Eleftheriadis cannot anticipate the claimed
subject matter and that, accordingly, all rejections of record should be reversed.

3. Claims 1-43 Are Not Anticipated Because Eleftheriadis Does
Not Disclose "performing object extraction processing to
generate multimedia object descriptions"

Independent claim 1 is directed to a system for generating a description record
from multimedia information, comprising, inter alia:

a computer processor, coupled to said at least one multimedia information 
input interface, receiving said multimedia information therefrom, 
processing said multimedia information by performing object extraction 
processing to generate multimedia object descriptions from said 
multimedia information, and processing said generated multimedia object 
descriptions by object hierarchy processing to generate multimedia object 
hierarchy descriptions indicative of an organization of said object 
descriptions, wherein at least one description record including said 
multimedia object descriptions and said multimedia object hierarchy 
descriptions is generated for content embedded within said multimedia 
information. 

(Claim 1). 

By way of background, the claimed subject matter relates to the MPEG-7 
standard, which comprises techniques for describing and organizing multimedia information (in 
fact, the inventors of the claimed subject matter contributed to the development of that standard 
through participation in a standards-setting body). (See Specification, p. 2, lines 11-30). As
described in the Background of the Invention (starting at p. 1 of the Specification), prior systems 
relate to means for searching textual information, both on the Internet and locally. However, at 
the time of the present application, there was no means in the art for searching multimedia 
content. An aim of MPEG-7 is to process multimedia such as video data to extract information 
about what is shown in the video and provide descriptions that may later aid in searching or 
cataloging the video. "Performing object extraction processing to generate multimedia object 
descriptions," as recited in the independent claims of the present application (and, by virtue of
dependency, as is included in all depending claims), is an important procedure for addressing
shortcomings of the prior art. (See, e.g., Specification, p. 26; Figs. 7 and 8).

Eleftheriadis is directed to a system and method for processing object-based
audiovisual information (which relates to a different standard - the MPEG-4 standard) which is
capable of flexibly encoding, storing and accessing a variety of data objects. See Eleftheriadis,
Abstract. The "Background of the Invention" portion of Eleftheriadis describes the challenges in
multimedia coding and storage for graphics, and, in particular, for streaming video. See
Eleftheriadis, col. 1, lines 17-59. Eleftheriadis addresses those challenges with a system and
method as described below:

The invention overcoming these and other problems in the art relates to a 
system, method, and associated medium for processing object-based 
audiovisual information which encodes, stores and retrieves not just 
overall frames, but individual segments containing AV objects which are 
then assembled into a scene according to embedded file information. The 
invention consequently provides very efficient streaming of and random 
access to component AV objects for even complex scenes. 
(Eleftheriadis, col. 1, line 62 - col. 2, line 3). 

Accordingly, Eleftheriadis addresses completely different issues (i.e., 
encoding/unencoding and playback of multimedia information (such as streaming video)) from 
the objects of the claimed subject matter herein (i.e., receiving already-encoded (or already- 
composed) multimedia information (such as streaming video) and indexing/classifying the 
content to facilitate subsequent text and/or other searching of that multimedia information). 
Indeed, this distinction is inherent in the differences between the subject matter of Eleftheriadis 
(e.g., related to MPEG-4 video composition/encoding/presentation) and the subject matter of the 
present invention (e.g., related to MPEG-7 video description), and would be immediately 
apparent to one of ordinary skill in the art. 
Eleftheriadis does not disclose or suggest at least the feature of "performing
object extraction processing to generate multimedia object descriptions" as recited in claims 1 
and 17 or "one or more multimedia object descriptions, generated by performing object 
extraction" as recited in claim 33. Indeed, the lack of such disclosure in Eleftheriadis is not 
surprising, since object extraction processing to generate multimedia object descriptions is 
entirely unnecessary for the purposes of MPEG-4 and Eleftheriadis (which are directed to, e.g., 
processing/encoding multimedia information, such as streaming video). 

The Examiner, on pp. 2-3 of the Office Action, maintains that Eleftheriadis 
discloses all elements of claim 1. Appellants respectfully disagree. 

In particular, the Examiner alleges that col. 7, lines 35-40 of Eleftheriadis 
discloses the claimed object extraction (Office Action, p. 2, including a citation to read operation 
module 290, object table 370, and MPEG-4 player 360). However, the entirety of the citation 
provides: 

In the diagram of FIG. 4, CPU 380 accesses storage device 280 (such as a 
hard drive) to cause a read operation to be performed on an MPEG-4 file 
at module 290, and a next segment header is read at module 300. The read 
operation module 290 accesses an object table 370 for translation 
purposes, and communicates extracted audiovisual data to MPEG-4 player 
360, which may comprise a video buffer, screen, audio channels and 
related output devices. ID check module 330 checks for an ID in the 
segment header, transmitting the ID to the Get Object ID module 320, or if 
not present moving back to next segment module 300. After MPEG-4 
player 360 has finished presenting the current audiovisual data, it 
transmits a request through request module 340 for the next AL PDU (ID), 
or may request a random AL PDU (ID) through module 350, which in turn 
communicates that information to the ID check module 310. 
Eleftheriadis, col. 7, lines 34-50. 

This portion of Eleftheriadis is clearly directed to a procedure for reading/playing back MPEG-4 
encoded video from a storage device, and has no relation whatsoever to the feature of 
"performing object extraction processing to generate multimedia object descriptions" as recited 
in claims 1 and 17 or "one or more multimedia object descriptions, generated by performing
object extraction" as recited in claim 33. Again, this distinction would be readily apparent to one
of ordinary skill in the art.

For at least this reason, Appellants respectfully submit that Eleftheriadis fails to
disclose or suggest all elements of independent claim 1 and its corresponding depending claims.
Eleftheriadis therefore cannot properly anticipate the claimed subject matter of claims 1-43 for at
least these reasons. Appellants respectfully submit that this alone is sufficient basis to reverse all
rejections of record.

4. Claims 1-43 Are Not Anticipated Because Eleftheriadis
Does Not Disclose "processing said generated multimedia
object descriptions by object hierarchy processing to generate
multimedia object hierarchy descriptions" or "multimedia
object hierarchy descriptions"

In addition to the limitations described above, claims 1 and 17 also recite
"processing said generated multimedia object descriptions by object hierarchy processing to
generate multimedia object hierarchy descriptions," and claim 33 similarly recites "multimedia
object hierarchy descriptions."

As discussed above, because Eleftheriadis fails to disclose or suggest generating
"multimedia object descriptions," Eleftheriadis cannot possibly disclose or suggest "processing
said generated multimedia object descriptions." For at least this reason, this additional limitation
of claims 1 and 17 is not disclosed or suggested by Eleftheriadis.

Additionally, the Examiner relies on col. 3, lines 35-40 of Eleftheriadis as
allegedly disclosing "processing said generated multimedia object descriptions by object
hierarchy processing to generate multimedia object hierarchy descriptions" (Office Action, p. 3,
including citation to a "tree-structured approach"). The full citation provides:
In terms of the AL PDU, BIFS and related data structures under MPEG-4,
that standard uses an object-based approach. Individual components of a 
scene are coded as independent objects (e.g. arbitrarily shaped visual 
objects, or separately coded sounds). The audiovisual objects are 
transmitted to a receiving terminal along with scene description 
information, which defines how the objects should be positioned in space 
and time, in order to construct the scene to be presented to a user. The 
scene description follows a tree structured approach, similar to the Virtual 
Reality Modeling Language (VRML) known in the art. The encoding of 
such scene description information is more fully defined in Part 1 of the 
official ISO MPEG-4 specification (MPEG-4 Systems), known in the art. 
BIFS information is transmitted in its own elementary stream, with its own 
time and clock stamp information to ensure proper coordination of events 
at the receiving terminal. 
Eleftheriadis, col. 3, lines 29-45. 

This lone reference in Eleftheriadis to a "tree-structured approach" is unrelated to the claimed
"processing said generated multimedia object descriptions by object hierarchy processing to
generate multimedia object hierarchy descriptions." As explained in detail in the quoted portion
of Eleftheriadis above, that reference refers to methods described in MPEG-4 and, e.g., VRML,
for providing multimedia scene information using a tree structure (for, e.g., streaming video
playback/scene presentation). However, that tree structure has nothing to do with describing the
content of the multimedia information for later search/retrieval. Accordingly, the cited portion
of Eleftheriadis bears no relation to the claimed "processing said generated multimedia object
descriptions by object hierarchy processing to generate multimedia object hierarchy
descriptions" as recited in independent claims 1 and 17. For similar reasons, the "one or more
multimedia object hierarchy descriptions indicative of an organization of said object
descriptions," of independent claim 33 are not disclosed or suggested by Eleftheriadis.

The claimed hierarchy, as further described in an embodiment, e.g., at pp. 17-20
of the present application, relates to an object hierarchy for description of particular video
objects with varying levels of specificity - for purposes of content description, and not hierarchy
of a scene for composing or presenting the scene or playing back streaming multimedia
information. The claimed object hierarchy processing can produce a "physical hierarchy" and a
"logical hierarchy," which relate to the physical location of objects in an image, and a higher
level hierarchy based on semantic descriptions of the objects in the image, respectively. (See
Specification, p. 17; Fig. 4). The object hierarchy descriptions may include semantic
information which is useful for searching a library of multimedia segments, such as "names of
the picture, the names of persons in the picture, the location where the picture was taken, the
event that is represented by the picture, the date of the picture, color features...." (Specification,
p. 20).

Accordingly, because Eleftheriadis fails to disclose or suggest at least these
additional claimed features, Eleftheriadis fails to anticipate independent claims 1, 17 and 33.
Additionally, because all dependent claims contain the foregoing limitations through dependency
from the independent claims, Appellants respectfully submit that the rejections of record should
be reversed as to all claims.

5. Claims 3, 7, 10, 15, 19, 23, 26 and 31 Are Further Not 
Anticipated Because Eleftheriadis Does Not Disclose 
"feature extraction processing" 

Claims 3, 7, 10, 15, 19, 23, 26 and 31 are not anticipated by Eleftheriadis by 
virtue of their dependency from independent claims 1 and 17, and for the reasons discussed 
above. Additionally, these claims include the further limitation of "feature extraction
processing." (See Specification, p. 26). Regardless of the outcome with respect to the
independent claims, claims 3, 7, 10, 15, 19, 23, 26 and 31 are independently patentable over 
Eleftheriadis for the additional reason that Eleftheriadis fails to disclose or suggest "feature 
extraction processing." 
In the Office Action, p. 3, the Examiner asserts that the following portion of
Eleftheriadis describes this claimed feature:

An overview of the invention is shown in FIG. 1 for a first illustrative 
embodiment relating to a system using stored files, and FIG. 2 for a 
second illustrative embodiment relating to a system using streaming files. 
In a streaming implementation, the user views incoming audiovisual 
portions as they arrive, which may be temporarily stored in electronic 
memory such as RAM or equivalent memory, but the audiovisual data is 
not necessarily assembled into a fixed file. In either case, an MPEG-4 file 
100 consists of a file header 20 containing global information about the 
AV objects contained within it, followed by an arbitrary number of 
segments 30 containing the AV objects within AL PDUs 60 and BIFS data 
consistent with the MPEG-4 standard known in the art. AV objects 40 can 
represent textual, graphical, video, audio or other information. 
Eleftheriadis, col. 3, lines 14-28. 

Appellants cannot find any reference in the above-cited paragraph to the claimed
"feature extraction." If the Examiner is treating, e.g., "AV objects 40 can represent textual,
graphical, video, audio or other information" as a disclosure of "features," and contends that
this somehow equates to the "feature extraction processing" of the claimed subject matter, Appellants
respectfully disagree. Again, as Eleftheriadis relates to presenting multimedia, it has no need to
"extract" information about that multimedia it has presented. The cited paragraph of
Eleftheriadis in particular relates to the playback of an, e.g., MPEG-4-type multimedia file, and
not to extracting information from multimedia. Accordingly, for at least this additional reason,
Appellants respectfully submit that the rejections of claims 3, 7, 10, 15, 19, 23, 26 and 31, and
the claims which depend from them, should be reversed.
B. Conclusion

For at least the reasons indicated above, Appellants respectfully submit that the
claimed subject matter, as discussed above, is not anticipated by the cited prior art. Reversal of 
the Examiner's rejections of the claims is therefore respectfully requested. 

Respectfully submitted, 



Dated: December 14, 2006 By: 

Paul A. Ragusa 

Patent Office Reg. No. 38,587 

Robert L. Maier 

Patent Office Reg. No. 54,291 

Attorneys for Appellants 
Baker Botts L.L.P. 
30 Rockefeller Plaza 
New York, NY 10112-4498 
Telephone: (212) 408-2500 
VIII. CLAIMS APPENDIX

Claims 1-43 are pending in this application: 

1. (Original) A system for generating a description record from multimedia
information, comprising: 

(a) at least one multimedia information input interface receiving said 
multimedia information; 

(b) a computer processor, coupled to said at least one multimedia information 
input interface, receiving said multimedia information therefrom, 
processing said multimedia information by performing object extraction 
processing to generate multimedia object descriptions from said 
multimedia information, and processing said generated multimedia object 
descriptions by object hierarchy processing to generate multimedia object 
hierarchy descriptions indicative of an organization of said object 
descriptions, wherein at least one description record including said 
multimedia object descriptions and said multimedia object hierarchy 
descriptions is generated for content embedded within said multimedia 
information; and 

(c) a data storage system, operatively coupled to said processor, for storing 
said at least one description record. 

2. (Original) The system of claim 1, wherein said multimedia
information comprises image information, said multimedia object 
descriptions comprise image object descriptions, and said multimedia 
object hierarchy descriptions comprise image object hierarchy 
descriptions. 

3. (Original) The system of claim 2, wherein said object extraction 
processing comprises: 

(a) image segmentation processing to segment each image in said image 
information into regions within said image; and 

(b) feature extraction processing to generate one or more feature descriptions 
for one or more of said regions; 
whereby said generated object descriptions comprise said one or more
feature descriptions for one or more of said regions. 

4. (Original) The system of claim 3, wherein said one or more feature 
descriptions are selected from the group consisting of text annotations, 
color, texture, shape, size, and position. 

5. (Original) The system of claim 2, wherein said object hierarchy 
processing comprises physical object hierarchy organization to generate 
physical object hierarchy descriptions of said image object descriptions 
that are based on spatial characteristics of said objects, such that said 
image object hierarchy descriptions comprise physical descriptions. 

6. (Original) The system of claim 5, wherein said object hierarchy 
processing further comprises logical object hierarchy organization to 
generate logical object hierarchy descriptions of said image object 
descriptions that are based on semantic characteristics of said objects, such 
that said image object hierarchy descriptions comprise both physical and 
logical descriptions. 

7. (Original) The system of claim 6, wherein said object extraction 
processing comprises: 

(a) image segmentation processing to segment each image in said image 
information into regions within said image; and 

(b) feature extraction processing to generate object descriptions for one or 
more of said region; 

and wherein said physical hierarchy organization and said logical 
hierarchy organization generate hierarchy descriptions of said object 
descriptions for said one or more of said regions. 

8. (Original) The system of claim 7, further comprising an encoder 
receiving said image object hierarchy descriptions and said image object 
descriptions, and encoding said image object hierarchy descriptions and 
said image object descriptions into encoded description information, 
wherein said data storage system is operative to store said encoded 
description information as said at least one description record. 
9. (Original) The system of claim 1, wherein said multimedia
information comprises video information, said multimedia object 
descriptions comprise video object descriptions including both event 
descriptions and object descriptions, and said multimedia hierarchy 
descriptions comprise video object hierarchy descriptions including both 
event hierarchy descriptions and object hierarchy descriptions. 

10. (Original) The system of claim 9, wherein said object extraction 
processing comprises: 

(a) temporal video segmentation processing to temporally segment said video 
information into one or more video events or groups of video events and 
generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more video 
events or groups of video events into one or more regions, and to generate 
object descriptions for said regions; and 

(c) feature extraction processing to generate one or more event feature 
descriptions for said one or more video events or groups of video events, 
and one or more object feature descriptions for said one or more regions; 
wherein said generated video object descriptions include said event feature 
descriptions and said object descriptions. 

11. (Original) The system of claim 10, wherein said one or more event 
feature descriptions are selected from the group consisting of text 
annotations, shot transition, camera motion, time and key frame, and 
wherein said one or more object feature descriptions are selected from the 
group consisting of color, texture, shape, size, position, motion, and time. 

12. (Original) The system of claim 9, wherein said object hierarchy 
processing comprises physical event hierarchy organization to generate 
physical event hierarchy descriptions of said video object descriptions that 
are based on temporal characteristics of said video objects, such that said 
video hierarchy descriptions comprise temporal descriptions. 

13. (Original) The system of claim 12, wherein said object hierarchy 
processing further comprises logical event hierarchy organization to 
generate logical event hierarchy descriptions of said video object 
descriptions that are based on semantic characteristics of said video 
objects, such that said hierarchy descriptions comprise both temporal and 
logical descriptions. 

14. (Original) The system of claim 13, wherein said object hierarchy 
processing further comprises physical and logical object hierarchy 
extraction processing, receiving said temporal and logical descriptions and 
generating object hierarchy descriptions for video objects embedded 
within said video information, such that said video hierarchy descriptions 
comprise temporal and logical event and object descriptions. 

15. (Original) The system of claim 14, wherein said object extraction 
processing comprises: 

(a) temporal video segmentation processing to temporally segment said video 
information into one or more video events or groups of video events and 
generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more video 
events or groups of video events into one or more regions, and to generate 
object descriptions for said regions; and 

(c) feature extraction processing to generate one or more event feature 
descriptions for said one or more video events or groups of video events, 
and one or more object feature descriptions for said one or more regions; 
wherein said generated video object descriptions include said event feature 
descriptions and said object descriptions, and wherein said physical event 
hierarchy organization and said logical event hierarchy organization 
generate hierarchy descriptions from said event feature descriptions, and 
wherein said physical object hierarchy organization and said logical object 
hierarchy organization generate hierarchy descriptions from said object 
feature descriptions. 

16. (Original) The system of claim 15, further comprising an encoder 
receiving said video object hierarchy descriptions and said video object 
descriptions, and encoding said video object hierarchy descriptions and 
said video object descriptions into encoded description information, 
wherein said data storage system is operative to store said encoded 
description information as said at least one description record. 

17. (Original) A method for generating a description record from 
multimedia information, comprising the steps of: 

(a) receiving said multimedia information; 

(b) processing said multimedia information by performing object extraction 
processing to generate multimedia object descriptions from said 
multimedia information; 

(c) processing said generated multimedia object descriptions by object 
hierarchy processing to generate multimedia object hierarchy descriptions 
indicative of an organization of said object descriptions, wherein at least 
one description record including said multimedia object descriptions and 
said multimedia object hierarchy descriptions is generated for content 
embedded within said multimedia information; and 

(d) storing said at least one description record. 

18. (Original) The method of claim 17, wherein said multimedia 
information comprises image information, said multimedia object 
descriptions comprise image object descriptions, and said multimedia 
object hierarchy descriptions comprise image object hierarchy 
descriptions. 

19. (Previously amended) The method of claim 18, wherein said object 
extraction processing step comprises the sub-steps of: 

(a) image segmentation processing to segment each image in said image 
information into regions within said image; and 

(b) feature extraction processing to generate one or more feature descriptions 
for one or more of said regions; 

whereby said generated image object descriptions comprise said one or 
more feature descriptions for one or more of said regions. 

20. (Original) The method of claim 19, wherein said one or more feature 
descriptions are selected from the group consisting of text annotations, 
color, texture, shape, size, and position. 

21. (Original) The method of claim 18, wherein said step of object 
hierarchy processing includes the sub-step of physical object hierarchy 
organization to generate physical object hierarchy descriptions of said 
image object descriptions that are based on spatial characteristics of said 
objects, such that said image hierarchy descriptions comprise physical 
descriptions. 

22. (Original) The method of claim 21, said step of object hierarchy 
processing further includes the sub-step of logical object hierarchy 
organization to generate logical object hierarchy descriptions of said 
image object descriptions that are based on semantic characteristics of said 
objects, such that said image object hierarchy descriptions comprise both 
physical and logical descriptions. 

23. (Original) The method of claim 22, wherein said step of object 
extraction processing further includes the sub-steps of: 

(a) image segmentation processing to segment each image in said image 
information into regions within said image; and 

(b) feature extraction processing to generate object descriptions for one or 
more of said region; 

and wherein said physical object hierarchy organization sub-step and said 
logical object hierarchy organization sub-step generate hierarchy 
descriptions of said object descriptions for said one or more of said 
regions. 

24. (Previously presented) The method of claim 18, further comprising the 
step of encoding said image object descriptions and said image object 
hierarchy descriptions into encoded description information prior to said 
data storage step. 

25. (Original) The method of claim 17, wherein said multimedia 
information comprises video information, said multimedia object 
descriptions comprise video object descriptions including both event 
descriptions and object descriptions, and said multimedia hierarchy 
descriptions comprise video object hierarchy descriptions including both 
event hierarchy descriptions and object hierarchy descriptions. 

26. (Original) The method of claim 25, wherein said step of object 
extraction processing comprises the sub-steps of: 

(a) temporal video segmentation processing to temporally segment said video 
information into one or more video events or groups of video events and 
generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more video 
events or groups of video events into one or more regions, and to generate 
object descriptions for said regions; and 

(c) feature extraction processing to generate one or more event feature 
descriptions for said one or more video events or groups of video events, 
and one or more object feature descriptions for said one or more regions; 
wherein said generated video object descriptions include said event feature 
descriptions and said object descriptions. 

27. (Original) The method of claim 26, wherein said one or more event 
feature descriptions are selected from the group consisting of text 
annotations, shot transition, camera motion, time and key frame, and 
wherein said one or more object feature descriptions are selected from the 
group consisting of color, texture, shape, size, position, motion, and time. 

28. (Original) The method of claim 25, wherein said step of object 
hierarchy processing includes the sub-step of physical event hierarchy 
organization to generate physical event hierarchy descriptions of said 
video object descriptions that are based on temporal characteristics of said 
video objects, such that said video hierarchy descriptions comprise 
temporal descriptions. 

29. (Original) The method of claim 28, wherein said step of object 
hierarchy processing further includes the sub-step of logical event 
hierarchy organization to generate logical event hierarchy descriptions of 
said video object descriptions that are based on semantic characteristics of 
said video objects, such that said hierarchy descriptions comprise both 
temporal and logical descriptions. 

30. (Original) The method of claim 29, wherein said step of object 
hierarchy processing further comprises the sub-step physical and logical 
object hierarchy extraction processing, receiving said temporal and logical 
descriptions and generating object hierarchy descriptions for video objects 
embedded within said video information, such that said video hierarchy 
descriptions comprise temporal and logical event and object descriptions. 

31. (Original) The method of claim 30, wherein said step of object 
extraction processing comprises the sub-steps of: 

(a) temporal video segmentation processing to temporally segment said video 
information into one or more video events or groups of video events and 
generate event descriptions for said video events, 

(b) video object extraction processing to segment said one or more video 
events or groups of video events into one or more regions, and to generate 
object descriptions for said regions; and 

(c) feature extraction processing to generate one or more event feature 
descriptions for said one or more video events or groups of video events, 
and one or more object feature descriptions for said one or more regions; 
wherein said generated video object descriptions include said event feature 
descriptions and said object descriptions, and wherein said physical event 
hierarchy organization and said logical event hierarchy organization 
generate hierarchy descriptions from said event feature descriptions, and 
wherein said physical object hierarchy organization and said logical object 
hierarchy organization generate hierarchy descriptions from said object 
feature descriptions. 

32. (Previously presented) The method of claim 31, further comprising 
the step of encoding said video object descriptions and said video object 
hierarchy descriptions into encoded description information prior to said 
data storage step. 

33. (Previously presented) A computer readable media containing 
digital information with at least one multimedia description record 
describing multimedia content for corresponding multimedia information, 
the description record comprising: 

(a) one or more multimedia object descriptions, generated by 
performing object extraction processing, said object descriptions 
describing corresponding multimedia objects; 

(b) one or more features characterizing each of said multimedia object 
descriptions; and 

(c) one or more multimedia object hierarchy descriptions indicative of 
an organization of said object descriptions, if any, relating at least a 
portion of said one or more multimedia objects in accordance with one or 
more characteristics. 

34. (Original) The computer readable media of claim 33, wherein said 
multimedia information comprises image information, said multimedia 
objects comprise image objects, said multimedia object descriptions 
comprise image object descriptions, and said multimedia object hierarchy 
descriptions comprise image object hierarchy descriptions. 

35. (Original) The computer readable media of claim 34, wherein said one 
or more features are selected from the group consisting of text annotations, 
color, texture, shape, size, and position. 

36. (Original) The computer readable media of claim 34, wherein said 
image object hierarchy descriptions comprise physical object hierarchy 
descriptions of said image object descriptions based on spatial 
characteristics of said image objects. 

37. (Original) The computer readable media of claim 36, wherein said 
image object hierarchy descriptions further comprises logical object 
hierarchy descriptions of said image object descriptions based on semantic 
characteristics of said image objects. 

38. (Original) The computer readable media of claim 33, wherein said 
multimedia information comprises video information, said multimedia 
objects comprise events and video objects, said multimedia object 
descriptions comprise video object descriptions including both event 
descriptions and object descriptions, said features comprise video event 
features and video object features, and said multimedia hierarchy 
descriptions comprise video object hierarchy descriptions including both 
event hierarchy descriptions and object hierarchy descriptions. 

39. (Original) The computer readable media of claim 38, wherein said one 
or more event feature descriptions are selected from the group consisting 
of text annotations, shot transition, camera motion, time and key frame, 
and wherein said one or more object feature descriptions are selected from 
the group consisting of color, texture, shape, size, position, motion, and 
time. 

40. (Original) The computer readable media of claim 38, wherein said 
event hierarchy descriptions comprise one or more physical hierarchy 
descriptions of said events based on temporal characteristics. 

41. (Original) The computer readable media of claim 40, wherein said 
event hierarchy descriptions further comprise one or more logical 
hierarchy descriptions of said events based on semantic characteristics. 

42. (Original) The computer readable media of claim 38, wherein said 
object hierarchy descriptions comprise one or more physical hierarchy 
descriptions of said objects based on temporal characteristics. 

43. (Original) The computer readable media of claim 39, wherein said 
object hierarchy descriptions further comprise one or more logical 
hierarchy descriptions of said objects based on semantic characteristics. 

IX. EVIDENCE APPENDIX 

None. 

X. RELATED PROCEEDINGS APPENDIX 

None. 